Explaining Neural Matrix Factorization with Gradient Rollback
Explaining the predictions of neural black-box models is an important
problem, especially when such models are used in applications where user trust
is crucial. Estimating the influence of training examples on a learned neural
model's behavior allows us to identify training examples most responsible for a
given prediction and, therefore, to faithfully explain the output of a
black-box model. The most generally applicable existing method is based on
influence functions, which scale poorly for larger sample sizes and models.
We propose gradient rollback, a general approach for influence estimation,
applicable to neural models where each parameter update step during gradient
descent touches a smaller number of parameters, even if the overall number of
parameters is large. Neural matrix factorization models trained with gradient
descent are part of this model class. These models are popular and have found a
wide range of applications in industry. Especially knowledge graph embedding
methods, which belong to this class, are used extensively. We show that
gradient rollback is highly efficient at both training and test time. Moreover,
we show theoretically that the difference between gradient rollback's influence
approximation and the true influence on a model's behavior is smaller than
known bounds on the stability of stochastic gradient descent. This establishes
that gradient rollback is robustly estimating example influence. We also
conduct experiments which show that gradient rollback provides faithful
explanations for knowledge base completion and recommender datasets.
Comment: 35th AAAI Conference on Artificial Intelligence, 2021. Includes Appendix.
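The core idea above — that influence can be tracked cheaply when each SGD step touches only a few parameters — can be illustrated with a toy sketch. This is a minimal illustration on a tiny matrix factorization model, not the authors' implementation; the data, learning rate, and rollback bookkeeping are all made-up assumptions.

```python
# Toy sketch of gradient-rollback-style influence estimation on a rank-1
# matrix factorization model with 1-d embeddings. Each SGD step touches only
# two parameters (U[u], V[i]), so storing per-example updates is cheap.
import random

random.seed(0)

# Toy data: (user, item, rating) triples that a rank-1 model can fit.
data = [(0, 0, 1.0), (0, 1, 0.5), (1, 0, 0.5), (1, 1, 0.25)]
U = [random.uniform(-0.1, 0.1) for _ in range(2)]  # user embeddings
V = [random.uniform(-0.1, 0.1) for _ in range(2)]  # item embeddings
lr = 0.1

# Cumulative parameter change caused by each training example.
rollback = {e: [0.0, 0.0] for e in range(len(data))}  # deltas for (U[u], V[i])

for epoch in range(50):
    for e, (u, i, y) in enumerate(data):
        err = U[u] * V[i] - y              # squared-error residual
        dU, dV = -lr * err * V[i], -lr * err * U[u]
        U[u] += dU
        V[i] += dV
        rollback[e][0] += dU               # record this example's updates
        rollback[e][1] += dV

def predict(u, i, undo=None):
    """Prediction, optionally rolling back one example's recorded updates."""
    uu, vv = U[u], V[i]
    if undo is not None:
        eu, ei, _ = data[undo]
        if eu == u:
            uu -= rollback[undo][0]
        if ei == i:
            vv -= rollback[undo][1]
    return uu * vv

# Estimated influence of training example 0 on the prediction for (0, 0):
influence = predict(0, 0) - predict(0, 0, undo=0)
```

Subtracting the recorded updates approximates retraining without that example, which is what makes the approach efficient compared to influence functions.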
Sensor-based human activity recognition: Overcoming issues in a real world setting
The rapid ageing of the population in industrialized societies calls for advanced tools to continuously monitor the activities of people. These tools usually aim to support active and healthy ageing and to detect possible health issues early, enabling a long and independent life. Recent advancements in sensor miniaturization and wireless communications have paved the way to unobtrusive activity recognition systems. Hence, many pervasive health care systems have been proposed which monitor activities through unobtrusive sensors and machine learning or artificial intelligence methods. Unfortunately, while these systems are effective in controlled environments, their actual effectiveness outside the lab is still limited due to various shortcomings of existing approaches.
In this work, we explore such systems and aim to overcome existing limitations and shortcomings. Focusing on physical movements and crucial activities, our goal is to develop robust activity recognition methods based on external and wearable sensors that produce high-quality results in a real-world setting. Under laboratory conditions, existing research has already shown that wearable sensors are suitable for recognizing physical activities, while external sensors are promising for more complex activities. Consequently, we investigate problems that emerge when moving out of the lab. These include handling the position of wearable devices, the need for large and expensive labeled datasets, the requirement to recognize activities in almost real time, the necessity to adapt deployed systems online to changes in user behavior, the variability in how an activity is executed, and the use of data and models across people. As a result, we present feasible solutions for these problems and provide useful insights for implementing the corresponding techniques. Further, we introduce approaches and novel methods for both external and wearable sensors, clarifying the limitations and capabilities of each sensor type. We investigate both types separately to clarify their contribution and applicability to recognizing different types of activities in a real-world scenario.
Overall, our comprehensive experiments and discussions show, on the one hand, the feasibility of recognizing both physical and complex activities in a real-world scenario. Comparing our techniques and results with existing work and state-of-the-art techniques also provides evidence for the reliability and quality of the proposed techniques. On the other hand, we identify promising research directions and highlight that combining external and wearable sensors seems to be the next step beyond activity recognition. In other words, our results and discussions show that combining external and wearable sensors would compensate for the weaknesses of the individual sensors with respect to certain activity types and scenarios. By addressing the outlined problems, we therefore pave the way for a hybrid approach. Along with our presented solutions, we conclude our work with a high-level multi-tier activity recognition architecture, showing that aspects like physical activity, (emotional) condition, used objects, and environmental features are critical for reliably recognizing complex activities.
LODE: Linking Digital Humanities Content to the Web of Data
Numerous digital humanities projects maintain their data collections in the
form of text, images, and metadata. While data may be stored in many formats,
from plain text to XML to relational databases, the use of the resource
description framework (RDF) as a standardized representation has gained
considerable traction during the last five years. Almost every digital
humanities meeting has at least one session concerned with RDF and linked
data. While most existing work in linked data has
focused on improving algorithms for entity matching, the aim of the
LinkedHumanities project is to build digital humanities tools that work "out of
the box," enabling their use by humanities scholars, computer scientists,
librarians, and information scientists alike. With this paper, we report on the
Linked Open Data Enhancer (LODE) framework developed as part of the
LinkedHumanities project. LODE supports non-technical users in enriching a
local RDF repository with high-quality data from the Linked Open Data cloud.
LODE links and enhances the local RDF repository without compromising the
quality of the data. In particular, LODE supports the user in the enhancement
and linking process by providing intuitive user-interfaces and by suggesting
high-quality linking candidates using tailored matching algorithms. We hope
that the LODE framework will be useful to digital humanities scholars,
complementing other digital humanities tools.
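The abstract above mentions suggesting high-quality linking candidates via tailored matching algorithms. As a rough illustration of what candidate suggestion can look like, here is a sketch using token-set Jaccard similarity over entity labels; the scoring function, threshold, and example labels are assumptions for illustration, not LODE's actual algorithms.

```python
# Illustrative sketch of suggesting linking candidates between a local
# repository label and labels from the Linked Open Data cloud.

def jaccard(a, b):
    """Token-set Jaccard similarity of two labels."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

def suggest(local_label, lod_labels, threshold=0.5):
    """Return LOD labels scoring above the threshold, best match first."""
    scored = [(jaccard(local_label, label), label) for label in lod_labels]
    return [label for score, label in sorted(scored, reverse=True)
            if score >= threshold]

lod = ["Johann Wolfgang von Goethe", "Johann Sebastian Bach",
       "Goethe Institute"]
candidates = suggest("Johann Wolfgang Goethe", lod)
```

A real system would combine several such signals (labels, types, contextual properties) and present the ranked candidates to the user for confirmation, which matches the "intuitive user-interfaces" aspect described above.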
ProcK: Machine Learning for Knowledge-Intensive Processes
We present a novel methodology to build powerful predictive process models.
Our method, denoted ProcK (Process & Knowledge), relies not only on sequential
input data in the form of event logs, but can learn to use a knowledge graph to
incorporate information about the attribute values of the events and their
mutual relationships. The idea is realized by mapping event attributes to nodes
of a knowledge graph and training a sequence model alongside a graph neural
network in an end-to-end fashion. This hybrid approach substantially enhances
the flexibility and applicability of predictive process monitoring, as both the
static and dynamic information residing in the databases of organizations can
be directly taken as input data. We demonstrate the potential of ProcK by
applying it to a number of predictive process monitoring tasks, including tasks
with knowledge graphs available as well as an existing process monitoring
benchmark where no such graph is given. The experiments provide evidence that
our methodology achieves state-of-the-art performance and improves predictive
power when a knowledge graph is available.
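The mapping step described above — event attributes pointing to knowledge-graph nodes whose representations enrich the event sequence — can be sketched as follows. This is a deliberately simplified stand-in: the embeddings would come from a graph neural network and the pooling would be a trained sequence model; here both are placeholder assumptions.

```python
# Toy sketch of ProcK's core idea: enrich each event in a log with the
# embedding of the knowledge-graph node its attribute value maps to, then
# feed the enriched sequence to a sequence model (here: a mean pool).

kg_embedding = {            # node -> embedding, e.g. produced by a GNN
    "clerk": [0.2, 0.1],
    "manager": [0.9, 0.4],
}

def enrich(event):
    """Concatenate the event's own features with its KG node embedding."""
    return event["features"] + kg_embedding[event["resource"]]

def trace_representation(trace):
    """Stand-in for a sequence model: mean-pool the enriched events."""
    vecs = [enrich(e) for e in trace]
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

trace = [
    {"features": [1.0], "resource": "clerk"},
    {"features": [0.0], "resource": "manager"},
]
rep = trace_representation(trace)  # combined static + dynamic information
```

The point of the end-to-end setup in the paper is that the graph and sequence components are trained jointly, so the node embeddings adapt to the prediction task rather than being fixed as in this sketch.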
Exploring semi-supervised methods for labeling support in multimodal datasets
Working with multimodal datasets is challenging, as it requires annotations that are often time consuming and difficult to acquire. This applies in particular to video recordings, which often need to be watched as a whole before they can be labeled. Additionally, other modalities like acceleration data are often recorded alongside a video. For that purpose, we created an annotation tool that enables annotating datasets of video and inertial sensor data. In contrast to most existing approaches, we focus on semi-supervised labeling support to infer labels for the whole dataset. This means that after labeling a small set of instances, our system is able to provide labeling recommendations. We rely on the acceleration data of a wrist-worn sensor to support the labeling of a video recording, applying template matching to identify time intervals of certain activities. We test our approach on three datasets: one containing warehouse picking activities, one consisting of activities of daily living, and one about meal preparations. Our results show that the presented method is able to give annotators hints about possible label candidates.
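The template-matching step described above can be sketched minimally: slide a short, labeled template over the acceleration stream and flag windows whose distance to the template falls below a threshold as label candidates. The signal, template, and threshold here are invented for illustration; the paper's actual distance measure may differ.

```python
# Minimal template matching on a 1-d acceleration signal.

def distance(a, b):
    """Euclidean distance between two equal-length windows."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def match(signal, template, threshold):
    """Return start indices of windows that resemble the template."""
    w = len(template)
    return [i for i in range(len(signal) - w + 1)
            if distance(signal[i:i + w], template) < threshold]

template = [0.0, 1.0, 0.0]                       # labeled activity example
signal = [0.0, 0.0, 1.0, 0.0, 0.0, 0.9, 0.1, 0.0]
hits = match(signal, template, threshold=0.3)    # candidate time intervals
```

Each hit marks a time interval the annotator can review in the video, which is how the acceleration channel supports labeling the video channel.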
An Incremental Approach to Entity Resolution
We present a query-time entity resolution process that works
in a highly parallel fashion. We use the application MobEx
to showcase our process, which consists of a mobile client
and a server, where the server takes the role of a mediator
and carries out the resolution. Results are propagated to
the client as early as possible. Resolution results that are
produced later in the process are sent as updates to the
client and thus improve earlier results.
An approach for incremental entity resolution at the example of social media data
When querying data providers on the web, one has no guarantee that they will reply within a given time. Some providers may not answer at all. This makes it infeasible to wait for a complete result before beginning with the entity resolution. To solve this problem, we present a query-time entity resolution approach that takes the asynchronous nature of the replies from data providers into account by starting the entity resolution as soon as the first results are returned. Resolved entities are propagated from the entity resolution engine to the mobile client as early as possible. Resolution results that are produced later are sent as updates to the client and thus improve earlier results.
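The incremental scheme in the two abstracts above — emit a result as soon as a record arrives, then send updates when later records merge into it — can be sketched as follows. The exact-match blocking key is a simplifying assumption; real resolution would use fuzzier matching.

```python
# Sketch of incremental, query-time entity resolution: records from slow
# providers are resolved as they arrive, and each arrival yields either a
# new entity for the client or an update to one sent earlier.

def key(record):
    """Normalized blocking key (assumption: exact match on lowercased name)."""
    return record["name"].strip().lower()

entities = {}  # key -> list of merged records

def resolve(record):
    """Integrate one record; return ('new' | 'update', entity key)."""
    k = key(record)
    if k in entities:
        entities[k].append(record)
        return ("update", k)   # client refines a result shown earlier
    entities[k] = [record]
    return ("new", k)          # client can show a first result immediately

events = [resolve(r) for r in [
    {"name": "Ada Lovelace", "source": "provider_a"},
    {"name": "Grace Hopper", "source": "provider_b"},
    {"name": "ada lovelace ", "source": "provider_c"},  # arrives late
]]
```

Because each arrival is handled independently, slow or silent providers never block the client: early answers go out immediately and late replies simply trigger updates.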
Challenges in Annotation of useR Data for UbiquitOUs Systems: Results from the 1st ARDUOUS Workshop
Labelling user data is a central part of the design and evaluation of
pervasive systems that aim to support the user through situation-aware
reasoning. It is essential both in designing and training the system to
recognise and reason about the situation, either through the definition of a
suitable situation model in knowledge-driven applications, or through the
preparation of training data for learning tasks in data-driven models. Hence,
the quality of annotations can have a significant impact on the performance of
the derived systems. Labelling is also vital for validating and quantifying the
performance of applications. In particular, comparative evaluations require the
production of benchmark datasets based on high-quality and consistent
annotations. With pervasive systems relying increasingly on large datasets for
designing and testing models of users' activities, the process of data
labelling is becoming a major concern for the community. In this work we
present a qualitative and quantitative analysis of the challenges associated
with annotation of user data and possible strategies towards addressing these
challenges. The analysis was based on the data gathered during the 1st
International Workshop on Annotation of useR Data for UbiquitOUs Systems
(ARDUOUS) and consisted of brainstorming as well as annotation and
questionnaire data gathered during the talks, poster session, live annotation
session, and discussion session.
Investigating the Usability of a Mobile App for Finding and Exploring Places and Events
In our two-step field study, we developed and evaluated mobEx, a mobile app for faceted exploration of social media data on Android phones. mobEx unifies the data sources of related commercial apps in the market by retrieving information from various providers. The goal of our study was to find out whether the subjects understood the metaphor of a time-wheel as a novel user interface feature for finding and exploring places and events, and how they use it. In addition, mobEx offers a grid-based and a list-based navigation menu for exploring the data. Here, we were interested in gaining qualitative insights into which type of navigation approach users prefer when they can choose between them. In this paper, we present the design and a preliminary analysis of the results of our study.